In this paper we suggest a theoretical method based on statistical mechanics
for treating the α-helix↔random coil transition in alanine polypeptides. We consider
this process as a first-order phase transition and develop a theory which is free of 
model parameters and is based solely on fundamental physical principles. It describes 
essential thermodynamical properties of the system such as heat capacity, the phase 
transition temperature and others from the analysis of the polypeptide potential 
energy surface calculated as a function of two dihedral angles, responsible for the 
polypeptide twisting. The suggested theory is general and with some modification 
can be applied for the description of phase transitions in other complex molecular 
systems (e.g. proteins, DNA, nanotubes, atomic clusters, fullerenes). 



I. INTRODUCTION 



The phase transitions in finite complex molecular systems, i.e. the transition from a
stable 3D molecular structure to a random coil state or vice versa (also known as the (un)folding
process), have a long-standing history of investigation (for review see, e.g., [1, 2, 3, 4]). The
phase transitions of this or similar nature occur or can be expected in many different complex
molecular systems and in nano objects, such as polypeptides, proteins, polymers, DNA,
fullerenes and nanotubes [5]. They can be understood as first-order phase transitions, which
are characterized by rapid growth of the system's internal energy at a certain temperature. 
As a result, the heat capacity of the system as a function of temperature acquires a sharp 
maximum at the temperature of the phase transition. 

In our recent paper [6] a novel ab initio theoretical method for the description of phase
transitions in the mentioned molecular systems has been suggested. In particular, it was
demonstrated that in polypeptides (chains of amino acids) one can identify specific, so-called
twisting degrees of freedom responsible for the folding dynamics of amino acid chains,
i.e. for the transition from a random coil state of the chain to its α-helix structure. The
twisting degrees of freedom are also sometimes referred to as the torsion degrees of freedom.
The essential domain of the potential energy surface of polypeptides with respect to these 
twisting degrees of freedom can be calculated and thoroughly analyzed on the basis of ab 
initio methods such as density functional theory (DFT) or the Hartree-Fock method. It was
shown [6] that this knowledge is sufficient for the construction of the partition function of a
polypeptide chain and thus for the development of its complete thermodynamic description, 
which includes the calculation of all essential thermodynamic variables and characteristics, 
e.g. free energy, heat capacity, phase transition temperature, etc. The method has been 
proved to be applicable for the description of the phase transition in polyalanine chains 
of different lengths by the comparison of the theory predictions with the results of several 
independent experiments and of molecular dynamics simulations. Similar descriptions can 
be developed for a large variety of complex molecular systems. 

Earlier studies of the folding process based on statistical mechanics principles (see
[7, 8, 9, 10]) always contained some empirical parameters and thus could hardly be used
for ab initio predictions of essential characteristics of the phase transitions. Since then, the 
total number of papers devoted to this problem is very large. Here we do not intend to 
review all of them, but refer in this article only to those, which are related directly to our 
work (for review see also and references therein). 

The first theoretical attempt to describe the folding process of polypeptides was made
by Zimm and Bragg [7]. In their work the process of polypeptide α-helix formation was
considered within the framework of a simple two-state statistical model. This model contains
three principal parameters: (i) a constant describing the probability of an amino acid to 
bond in the helix conformation to a part of the chain being in the helical form, (ii) a special 
correction factor for the initiation of helix formation (i.e. a factor describing the probability 
of an amino acid to bond in the helix conformation to an amino acid that is in the random 
coil state), and (iii) the minimum number of amino acids allowed to exist in the random coil 
state between two helical parts. 

A different set of parameters was suggested in [8]. The major parameters used in that
paper are the energies of hydrogen bonds in the polypeptide chain and the number of possible
conformations in the random coil state. These two parameters define the energy and entropy
differences between folded and unfolded states of the polypeptide. In [10] the factors affecting
the stability of polypeptide structures in solution were discussed.

In [9] the partition function of a polypeptide chain was determined as a function of
generalized coordinates corresponding to the twisting degrees of freedom of the molecule's
backbone. In that paper the conditional probabilities of the occurrence of helical and coil
states of the peptide units are obtained in the form of a 3 × 3 matrix. The eigenvalues of this
matrix yield the various molecular averages as functions of the degree of polymerization,
temperature, and molecular constants. The theoretical model suggested in [9] contained
three parameters which describe the statistical weights of three possible states of an amino 
acid in a polypeptide chain: the helix state, the coil state and the boundary state occurring 
at the interface between the helix and the coil phases. 



In [11] another method was suggested for the derivation of the partition function of
linear-chain molecules. The partition function was constructed on the basis of the so-called
defining sequences, i.e. sequences of numbers that describe the lengths of the polypeptide
parts found in different conformational states. Each defining sequence therefore describes a
certain microstate of the system. The partition function of the system was constructed from
the partition functions of the defining sequences. To do so, special functions, called
sequence-generating functions, were introduced. The method suggested in
[11] was used in 3] for the study of helix-coil transition in polypeptides. In that paper the
conditions for the occurrence of a phase transition in a one-dimensional system were analyzed.
In {3] the kinetics of helix-coil transition was studied within the theoretical frameworks 
developed in 

In fbl \u\ the importance of various internal degrees of freedom in polypeptide was 
discussed. The partition function of the system was constructed within the framework of 
classical and quantum mechanics. 



The helix-coil transition of polypeptides was also studied in Refs. [la, ll2j • In those papers 
general equations of statistical physics were used to describe this transition. Those theories 
contained several parameters (such as enthalpy, entropy, free energy changes) which were 
fitted to represent results of independent experimental observations. 

The molecular dynamics (MD) approach, an alternative to using statistical physics, has
been widely used during the last decade for studying structural transitions in polypeptides.
Full atomistic molecular dynamics [18, 19, 20] and Monte-Carlo based techniques [21]
were used for studying alanine tripeptide [18], alanine pentapeptide [19] and alanine peptide
[20, 21, 22]. The molecular dynamics simulations were carried out within the framework
of classical mechanics with an empirical Hamiltonian usually referred to as the forcefield. The
most popular forcefields developed during recent years are GROMOS [23], AMBER [24] and
CHARMM [25].



During the last years molecular dynamics was also widely applied for studying the folding
process of small proteins [26, 27, 29, 30, 31]. Such simulations became possible relatively
recently due to modern computer power. However, it is still not feasible to perform molecular
dynamics simulations of the folding process of large proteins 11 because the characteristic
timescale of this process varies from microseconds to minutes, being several orders
of magnitude larger than the time of possible molecular dynamics simulations.

Another molecular dynamics approach for studying the protein folding problem was suggested
in [35]. In these papers the dynamics of the macromolecule was considered in the
phase space of torsional degrees of freedom.

Stochastic treatment of helix-coil transition in polypeptides was performed in [36, 37]. In [36]
the application of correlated random walk theory for polypeptides was analyzed. In
[37] an atomistic simulation of helix formation with the stochastic difference equation was
performed.

The helix-coil transition of polypeptides has also been extensively studied experimentally
[38, 39, 40, 41]. In [38] the enthalpy change accompanying the α-helix to coil transition has
been determined calorimetrically for a 50-residue Ac-Y(AEAAKA)$_8$F-NH$_2$ peptide that contains
primarily alanine. The dependence of the heat capacity of the polypeptide on temperature
was measured by differential scanning calorimetry. In [39], the
experiments were performed for A$_5$(A$_3$RA)$_3$A and MABA-A$_5$-(AAARA)$_3$-A-NH$_2$ alanine-rich
peptides consisting of 21 amino acids by means of UV resonance Raman spectroscopy
and by circular dichroism, respectively. The dependence of helicity on temperature was
recorded. The kinetics of the helix-coil transition of the 21-residue Suc-AAAAA-(AAARA)$_3$A-NH$_2$
alanine-based polypeptide was studied in [41] by means of infrared spectroscopy.

Previous attempts to describe the helix-coil transition in polypeptide chains within the
framework of statistical physics were based on the models suggested in the sixties
[7, 8, 9, 10], where the general formalism for the construction of the partition function
of polypeptides was suggested. Earlier theories always included several parameters in the
partition function, making it parameter dependent. The methods suggested in those works
were widely used for the description of the helix-coil transition in polypeptide chains (see
Refs. [42, 43, 44, 45]). The dependence of the thermodynamic characteristics
of the α-helix↔random coil phase transition in polypeptides on model parameters, used
for the partition function construction, was thoroughly analysed (see papers cited above).
Some attempts were made to obtain these parameters from experimental observations and
from theoretical calculations. In [46] the parameters of the Zimm and Bragg theory [7]
were deduced from the optical rotatory dispersion and circular dichroism measurements on
poly(L-cystine) in water at neutral pH.

The first attempts to evaluate the parameters of the Zimm-Bragg theory theoretically
were performed in [44]. In that paper a semi-empirical potential [47, 48] was used to describe
the conformational dynamics of the polypeptide. The potential suggested in these papers
is similar to the modern forcefields [23, 24, 25], but treats the structure of a polypeptide in
a simplified way by neglecting some of the hydrogen atoms in the polypeptide and making
minimal assumptions about the hybridization of atoms. The potential used in [47, 48] can be
considered as one of the first (if not the first) forcefields suggested. With its use in [44] the
parameters of the Zimm-Bragg theory were calculated and the temperature of the helix-coil
transition in a polypeptide chain was established. In that paper the partition function was
constructed and evaluated within a matrix approach developed in {^J 

The parameters of the Zimm-Bragg theory were also calculated by means of molecular
dynamics simulation [49]. A peptide growth simulation method was introduced, which
allowed the generation of dynamic models of polypeptide chains in α-helix or random coil
conformations. With this method the Zimm-Bragg parameters for helix initiation and helix
growth have been calculated. 

In the present paper we describe an alternative theoretical approach based on statistical
mechanics for treating the α-helix↔random coil phase transition in alanine polypeptides.
The suggested method is a further development of the method suggested in [5, 6], which is
based on the construction of a parameter-free partition function for a system experiencing
a phase transition. All the necessary information for the construction of such a partition
function can be calculated on the basis of ab initio DFT, combined with molecular mechanics
theories. Comparison of the results of this method with the results of molecular
dynamics simulations (see the following paper [50]) allows one to establish the accuracy of the
new approach for sufficiently large molecular systems and then to extend the description
to larger molecular objects, which is especially essential in those cases when molecular
dynamics simulations are hardly possible because of computer power limitations.

We note that the suggested method is considered as an efficient novel alternative to the 
existing theoretical approaches for the study of helix-coil transitions in polypeptides since it 
does not contain any model parameters and gives a universal recipe for the construction of the 
partition function in complex molecular systems. The partition function of the polypeptide 
is constructed based on a minimal number of assumptions about the system which are 
different from those used in earlier theories. It includes all essential physical contributions 
needed for the description of the helix-coil transition in polypeptides. Therefore the final 
expression for the partition function obtained within the framework of our theory is different 
from the ones suggested earlier. 

In this paper we present in detail the theoretical method for the study of α-helix↔random
coil phase transitions in polypeptides, while in the following paper [50] we report the results
of numerical simulations of this process.

II. STATISTICAL MODEL FOR THE α-HELIX↔RANDOM COIL PHASE TRANSITION

Let us consider a polypeptide, consisting of n amino acids. The polypeptide can be found 
in one of its numerous isomeric states that have different energies. A group of isomeric states 
with similar characteristic physical properties is called a phase state of the polypeptide. 
Thus, a regular bounded a-helix state corresponds to one phase state of the polypeptide, 
while all possible unbounded random conformations can be denoted as the random coil phase 
state. 

The phase transition is the transformation of the polypeptide from one phase state to
another, i.e. the transition from a regular α-helix conformation to a group of unbounded
random conformations. The characteristic structural change of alanine polypeptide experiencing
an α-helix↔random coil phase transition is shown in Fig. 1. In this figure we show
only one characteristic conformation of the polypeptide in the random coil state, while there
exist about $10^{30}$ different conformations of a polypeptide of 21 alanines (see [6] for more details).

FIG. 1: The characteristic structural change of alanine polypeptide experiencing an
α-helix↔random coil phase transition (panels: α-helix, random coil).

The phase transition can either be of the first or of the second order. The first-order
phase transition is characterized by an abrupt change of the internal energy of the system
with respect to its temperature. In the first order phase transition the system either absorbs 
or releases a fixed amount of energy while the heat capacity as a function of temperature has 
a pronounced peak Q]. We study the manifestation of these features for alanine polypeptide 
chains of different lengths. 



A. Hamiltonian of a polypeptide chain 

To study thermodynamic properties of the system one needs to investigate its potential 
energy surface with respect to all degrees of freedom. There are a number of different 
methods for calculating the energy of many-body systems. The most accurate approaches
are based on solving the Schrödinger equation. These approaches are usually referred to as ab
initio methods since they involve a minimum number of assumptions about the system.

For complex molecular systems ab initio calculations require significant computer power. 
Depending on the method, the computational cost of such calculations grows as $N^2$ or even
$N^8$ [51], where $N$ is the number of particles in the system. The size of a molecular system
which can be described using ab initio methods is therefore limited, and such methods can 
hardly be used for the description of large biological molecules or systems. 

For the description of macromolecular systems, such as polypeptides and proteins, effi- 
cient model approaches are necessary. One of the most common tools for the description of 
macromolecules is based on the so-called molecular mechanics potential, which reads as 
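
$$U = \sum_{i=1}^{N_b} k_i^{b}\left(r_i - r_i^{0}\right)^2 + \sum_{i=1}^{N_a} k_i^{a}\left(\theta_i - \theta_i^{0}\right)^2 + \sum_{i=1}^{N_d} k_i^{d}\left[1 + \cos\left(n_i\varphi_i + \delta_i\right)\right] + \sum_{i=1}^{N_{id}} k_i^{id}\left(S_i - S_i^{0}\right)^2 + \sum_{i<j}^{N} 4\epsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] + \sum_{i<j}^{N} \frac{q_i q_j}{r_{ij}}. \qquad (1)$$
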



Here the first four terms describe the potential energy with respect to variation of distances,
angles, dihedral angles and improper dihedral angles between two, three and four neighboring
atoms, respectively. The last two terms describe the van der Waals and Coulomb interactions,
respectively. The summation in the first term goes over all topologically defined bonds
in the system, in the second over all topologically defined angles, in the third over all
topologically defined dihedral angles and in the fourth over all topologically defined improper
dihedral angles. The total numbers of bonds, angles, dihedral angles and improper dihedral
angles are $N_b$, $N_a$, $N_d$ and $N_{id}$, respectively. $N$ is the total number of atoms in the system.

$k_i^{b}$, $k_i^{a}$, $k_i^{d}$ and $k_i^{id}$ in (1) are the stiffness parameters of the corresponding energy terms. $r_i^{0}$,
$\theta_i^{0}$ and $S_i^{0}$ are the equilibrium values of bonds, angles and improper dihedral angles, $n_i$ and
$\delta_i$ are the number of possible stable torsion conformations and the initial torsion phase, and $\epsilon_{ij}$,
$\sigma_{ij}$ and $q_i$ are the van der Waals parameters and the charges of atoms in the system.

Parameters $k_i^{b}$, $k_i^{a}$, $k_i^{d}$, $k_i^{id}$, $r_i^{0}$, $\theta_i^{0}$, $S_i^{0}$, $n_i$, $\delta_i$, $\epsilon_{ij}$, $\sigma_{ij}$ and $q_i$ are derived from experimental
measurements of crystallographic structures, infrared spectra or on the basis of quantum
mechanical calculations for small systems (see [23, 24, 25] and references therein). The
independent variables in (1) are $r_i$, $\theta_i$, $\varphi_i$ and $S_i$.
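
To make the structure of the molecular mechanics potential (1) concrete, the individual term types can be sketched in Python. This is only an illustration: the function names are hypothetical, the units are left unspecified, and the Coulomb term is assumed to carry no dielectric constant. A real forcefield additionally supplies the topology and the parameter tables.

```python
import math

def harmonic(k, x, x0):
    """Harmonic term k*(x - x0)**2, used for bonds, angles and impropers."""
    return k * (x - x0) ** 2

def torsion_energy(k_d, n, phi, delta):
    """Periodic dihedral term k_d*[1 + cos(n*phi + delta)], phi in radians."""
    return k_d * (1.0 + math.cos(n * phi + delta))

def lennard_jones(eps, sigma, r):
    """Van der Waals term 4*eps*[(sigma/r)**12 - (sigma/r)**6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

def coulomb(q_i, q_j, r):
    """Coulomb term q_i*q_j/r (assumption: no dielectric screening)."""
    return q_i * q_j / r
```

For example, a threefold torsion term with zero phase is largest at `phi = 0` and vanishes at `phi = pi/3`, which is the periodic behaviour that allows several stable conformations.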

Note that the terms corresponding to the variations of distances, angles and improper
dihedral angles in (1) describe the motion of the molecule within the harmonic approximation,
which is reasonable only at low temperatures. The potential energy corresponding to torsion
degrees of freedom is usually assumed to be periodic (see equation (1)) because several
stable conformations of the molecule with respect to these degrees of freedom are possible
[23, 24, 25, 52, 53, 54, 55]. The torsion degrees of freedom are also referred to as the twisting
degrees of freedom [53, 54, 55]. The most important twisting degrees of freedom for
the description of a helix-coil transition in polypeptides are the twisting degrees of freedom
along the backbone of the polypeptide [35]. These degrees of freedom are defined
for each amino acid of the polypeptide except for the boundary ones and are described by
two dihedral angles $\varphi_i$ and $\psi_i$ (see Fig. 2).




FIG. 2: Dihedral angles $\varphi$ and $\psi$ used for characterization of the secondary structure of a polypeptide
chain. The dihedral angle $\chi_i$ characterizes the rotation of the side radical along the $C^{\alpha}_i - C^{\beta}_i$
bond.







Both angles are defined by four neighboring atoms in the polypeptide chain. The angle $\varphi_i$
is defined as the dihedral angle between the planes formed by the atoms ($C'_{i-1} - N_i - C^{\alpha}_i$) and
($N_i - C^{\alpha}_i - C'_i$). The angle $\psi_i$ is defined as the dihedral angle between the ($N_i - C^{\alpha}_i - C'_i$)
and ($C^{\alpha}_i - C'_i - N_{i+1}$) planes. The atoms are numbered from the NH$_2$-terminal of the
polypeptide. The angles $\varphi_i$ and $\psi_i$ take all possible values within the interval [−180°; 180°].
For the unambiguous definition the angles $\varphi_i$ and $\psi_i$ are counted clockwise, if one looks at
the molecule from its NH$_2$-terminal (see Fig. 2). This way of angle counting is the most
commonly used [52, 53, 54, 55, 56].
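
A dihedral angle of this kind can be computed from the Cartesian positions of the four atoms. The following Python sketch uses the common atan2 form of the signed dihedral; the helper names are hypothetical, and the sign convention of this formula is an assumption that should be checked against the clockwise counting described above before use.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, in [-180, 180]) of atoms p0-p1-p2-p3."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

For the backbone angles the four points would be the positions of $C'_{i-1}$, $N_i$, $C^{\alpha}_i$, $C'_i$ for $\varphi_i$ and of $N_i$, $C^{\alpha}_i$, $C'_i$, $N_{i+1}$ for $\psi_i$.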



A Hamiltonian function of a polypeptide chain is constructed as a sum of the potential,
kinetic and vibrational energy terms. For a polypeptide chain in a particular conformational
state $j$ consisting of $n$ amino acids and $N$ atoms we obtain:

$$H_j = \frac{P^2}{2M} + \frac{1}{2}\left(I_1^{(j)}\Omega_1^2 + I_2^{(j)}\Omega_2^2 + I_3^{(j)}\Omega_3^2\right) + \sum_{i=1}^{3N-6}\frac{p_i^2}{2m_i} + U(\{x\}), \qquad (2)$$

where $P$, $M$, $I^{(j)}_{1,2,3}$, $\Omega_{1,2,3}$ are the momentum of the whole polypeptide, its mass, its three
main momenta of inertia, and its rotational frequencies. $p_i$, $x_i$ and $m_i$ are the momentum,
the coordinate and the generalized mass describing the motion of the system along the $i$-th
degree of freedom. $U(\{x\})$ is the potential energy of the system, being a function of all
atomic coordinates in the system.

One can group all degrees of freedom in a polypeptide into two classes: "stiff" and
"soft" degrees of freedom. We call the degrees of freedom corresponding to the variation of
bond lengths, angles and improper dihedral angles (see Fig. 2) "stiff", while degrees of
freedom corresponding to the angles $\varphi_i$ and $\psi_i$ are classified as "soft" degrees of freedom.
The "stiff" degrees of freedom can be treated within the harmonic approximation because
the energies needed for a noticeable change of the system structure with respect to these
degrees of freedom are about several eV, which is significantly larger than the characteristic
thermal energy of the system at room temperature, being on the order of 0.026 eV [23, 24,
25, 54, 55, 57].



The Hamiltonian of the polypeptide can be rewritten in terms of the "soft" and "stiff"
degrees of freedom. Transforming the set of cartesian coordinates $\{x\}$ to a set of generalized
coordinates $\{q\}$, corresponding to the "soft" and "stiff" degrees of freedom, one obtains:

$$H_j = \frac{P^2}{2M} + \frac{1}{2}\left(I_1^{(j)}\Omega_1^2 + I_2^{(j)}\Omega_2^2 + I_3^{(j)}\Omega_3^2\right) + \sum_{i=1}^{l_s}\sum_{j=1}^{l_s} g_{ij}\frac{p_i^s p_j^s}{2} + \sum_{i=1}^{l_s}\sum_{j=l_s+1}^{l_s+l_h} g_{ij}\, p_i^s p_j^h + \sum_{i=l_s+1}^{l_s+l_h}\sum_{j=l_s+1}^{l_s+l_h} g_{ij}\frac{p_i^h p_j^h}{2} + U(\{q^s\},\{q^h\}), \qquad (3)$$

where $q^s$ and $q^h$ are the generalized coordinates corresponding to the "soft" and "stiff"
degrees of freedom, and $p^s$ and $p^h$ are the corresponding generalized momenta. $l_s$ and $l_h$ are
the numbers of the "soft" and "stiff" degrees of freedom in the system, satisfying the relation
$3N - 6 = l_s + l_h$. $U(\{q^s\},\{q^h\})$ in Eq. (3) is the potential energy of the system as a function
of the "soft" and "stiff" degrees of freedom. $1/g_{ij}$ has the meaning of the generalized mass,
while $g_{ij}$ is defined as follows:

$$g_{ij} = \sum_{\lambda=1}^{3N-6}\frac{1}{m_\lambda}\frac{\partial q_i}{\partial x_\lambda}\frac{\partial q_j}{\partial x_\lambda}. \qquad (4)$$

Here $x_\lambda$ and $m_\lambda$ are the generalized coordinate in the cartesian space and the generalized
mass of the system, corresponding to the degree of freedom with index $\lambda$. $q_i$ and $q_j$ denote
the "soft" or the "stiff" generalized coordinate in the transformed space.

The motion of the system with respect to its "soft" and "stiff" degrees of freedom
occurs on different time scales, as was discussed in [15]. The typical oscillation frequency
corresponding to the "soft" degrees of freedom is on the order of 100 cm$^{-1}$, while for the
"stiff" degrees of freedom it is more than 1000 cm$^{-1}$ [15]. Thus the motion of the system
with respect to the "soft" degrees of freedom is uncoupled from the motion of the system
with respect to the "stiff" degrees of freedom. Therefore the fifth term in Eq. (3), which
describes the kinetic energy of the "stiff" motions in the polypeptide, can be diagonalized.
The corresponding set of coordinates $\{q^h\}$ describes the normal vibration modes in the "stiff"
subsystem:



$$H_j = \frac{P^2}{2M} + \frac{1}{2}\left(I_1^{(j)}\Omega_1^2 + I_2^{(j)}\Omega_2^2 + I_3^{(j)}\Omega_3^2\right) + \sum_{i=1}^{l_h}\left(\frac{(p_i^h)^2}{2\mu_i^h} + \frac{\mu_i^h\omega_i^2 (q_i^h)^2}{2}\right) + \sum_{i=1}^{l_s}\sum_{j=1}^{l_s} g_{ij}\frac{p_i^s p_j^s}{2} + U(\{\chi\}) + U(\{\varphi,\psi\}). \qquad (5)$$

Here $\omega_i$ and $\mu_i^h$ are the frequency of the $i$-th "stiff" normal vibrational mode and the
corresponding generalized mass. Note that the fourth term in Eq. (3) vanishes if the "soft"
and the "stiff" degrees of freedom are uncoupled. The last two terms in Eq. (5) describe the
potential energy of the system with respect to the "soft" degrees of freedom. For every amino
acid there are at least two "soft" degrees of freedom, corresponding to the angles $\varphi_i$ and
$\psi_i$ (see Fig. 2). Some additional "soft" degrees of freedom involve the rotation of the side
radicals in amino acids. A typical example is the angle $\chi_i$, which describes the twisting of
the side chain radical along the $C^{\alpha}_i - C^{\beta}_i$ bond (see Fig. 2). The angle $\chi_i$ is defined as the
dihedral angle between the plane formed by the atoms ($C'_i - C^{\alpha}_i - C^{\beta}_i$) and the plane formed by
the bonds $C^{\alpha}_i - C^{\beta}_i$ and $C^{\beta}_i - H_i$. Note that the notations $\chi$, $\varphi$ and $\psi$ are used for simplicity and
for the further explanation of our theory. The set of these dihedral angles builds up the set
of "soft" degrees of freedom of the polypeptide: $\{q^s\} = \{\chi, \varphi, \psi\}$.

Note that the generalized masses $1/g_{ij}$ depend on the choice of the generalized coordinates
in the system. However, this dependence can be neglected if the system is considered in the
vicinity of its equilibrium state. In this case the motion of the polypeptide with respect
to the "soft" degrees of freedom can be considered as the motion of a system of coupled
nonlinear oscillators. In the vicinity of the system's equilibrium state the generalized mass
can be written as:

$$\frac{1}{g_{ij}} \approx \frac{1}{g_{ij}(\{q_k^0\})} + \sum_{k=1}^{l_s}\left.\frac{\partial}{\partial q_k}\left(\frac{1}{g_{ij}}\right)\right|_{q_k = q_k^0}\left(q_k - q_k^0\right), \qquad (6)$$

where $q_k^0$ denotes the value of the $k$-th "soft" degree of freedom at the equilibrium position.
The second term in Eq. (6) describes the dependence of the generalized mass on coordinates
and can be neglected if the system is in the vicinity of its equilibrium. All the information
about the nonlinearity of the oscillations is contained in the potential energy functions
$U(\{\chi\})$ and $U(\{\varphi,\psi\})$ in Eq. (5).

The validity of the coordinate-independent mass approximation was also discussed in
Ref. [15]. In the present paper we do not account for the coordinate dependence of the
generalized masses, $1/g_{ij}$, and leave this question open for further investigation.

B. Partition function 

The partition function of the polypeptide is constructed within the framework of classical
mechanics. We consider the classical partition function because in our following paper [50]
we have treated the polypeptide classically. However, the presented formalism can be easily
generalized for the quantum mechanical description of the system.

All thermodynamic properties of a system are determined by its partition function, which
can be expressed via the system's Hamiltonian in the following form [58]:

$$Z = \int \exp\left(-\frac{H}{kT}\right) d\Gamma, \qquad (7)$$

where $H$ is the Hamiltonian of the system, $k$ and $T$ are the Boltzmann constant and the
temperature, respectively, and $d\Gamma$ is an element of the phase space. Substituting (5) into
(7) one obtains an expression for the partition function of a polypeptide in a particular
conformational state $j$. Thus, the partition function of the system can be factored as follows:
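
$$Z_j = \frac{1}{(2\pi\hbar)^{3N}}\, Z_1 \cdot Z_2 \cdot Z_3 \cdot Z_4 \cdot Z_5, \qquad (8)$$

where

$$Z_1 = \int \exp\left[-\frac{1}{kT}\left(\frac{P^2}{2M} + \frac{M_1^2}{2I_1^{(j)}} + \frac{M_2^2}{2I_2^{(j)}} + \frac{M_3^2}{2I_3^{(j)}}\right)\right] d^3P\, d^3Q\, d^3M\, d^3\Phi = 64\pi^5\, V_j\, M^{3/2}\sqrt{I_1^{(j)} I_2^{(j)} I_3^{(j)}}\,(kT)^3, \qquad (9)$$

$$Z_2 = \int \exp\left[-\frac{1}{kT}\sum_{i=1}^{l_h}\left(\frac{(p_i^h)^2}{2\mu_i^h} + \frac{\mu_i^h\omega_i^2 (q_i^h)^2}{2}\right)\right] d^{l_h}p^h\, d^{l_h}q^h = \prod_{i=1}^{l_h}\frac{2\pi kT}{\omega_i}, \qquad (10)$$

$$Z_3 = \int \exp\left[-\frac{1}{kT}\sum_{i=1}^{l_s}\frac{(\tilde{p}_i^s)^2}{2\mu_i^s}\right] d^{l_s}\tilde{p}^s = (2\pi kT)^{l_s/2}\prod_{i=1}^{l_s}\sqrt{\mu_i^s}, \qquad (11)$$

$$Z_4 = \int \exp\left(-\frac{U(\{\chi\})}{kT}\right) d^{l_\chi}\chi, \qquad (12)$$

$$Z_5 = \int \exp\left(-\frac{U(\{\varphi,\psi\})}{kT}\right) d^{l_\varphi}\varphi\, d^{l_\psi}\psi. \qquad (13)$$
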

$Z_1$, Eq. (9), describes the contribution to the partition function originating from the motion
of the polypeptide as a rigid body. Here $V_j$ is the specific volume of the polypeptide in
conformational state $j$ and $M_{1,2,3}$ are the angular momenta of the polypeptide. $Z_2$, Eq. (10),
accounts for the "stiff" degrees of freedom in the polypeptide. $Z_3$, Eq. (11), describes the
contribution of the kinetic energy of the "soft" degrees of freedom to the partition function.
$Z_4$, Eq. (12), and $Z_5$, Eq. (13), describe the contribution of the potential energy of the
"soft" degrees of freedom to the partition function. Integration in
Eqs. (9)-(13) is performed over the generalized coordinate and momentum space.
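
The contribution of a single "stiff" harmonic mode can be checked numerically. The sketch below assumes a Hamiltonian $p^2/2\mu + \mu\omega^2 q^2/2$ for one mode and integrates the Boltzmann factor over $p$ and $q$ with a simple trapezoid rule; the function names and the chosen numbers are illustrative only. Under this assumption the phase-space integral equals $2\pi kT/\omega$ and does not depend on the generalized mass $\mu$.

```python
import math

def gaussian_integral(a, n_points=20001):
    """Trapezoid-rule integral of exp(-a*x**2) over +-10 standard deviations."""
    half_width = 10.0 / math.sqrt(2.0 * a)
    h = 2.0 * half_width / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * math.exp(-a * x * x)
    return total * h

def stiff_mode_factor(mu, omega, kT):
    """Phase-space integral of exp(-H/kT) for one harmonic mode."""
    z_p = gaussian_integral(1.0 / (2.0 * mu * kT))          # momentum part
    z_q = gaussian_integral(mu * omega ** 2 / (2.0 * kT))   # coordinate part
    return z_p * z_q
```

Calling `stiff_mode_factor` with two different values of `mu` but the same `omega` and `kT` gives the same number, which is why each "stiff" mode contributes one power of $kT$ and one factor $1/\omega_i$.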

For the derivation of Eqs. (11)-(13) we have diagonalized the quadratic form of the generalized
momenta corresponding to the "soft" degrees of freedom in Eq. (5) and made a
transformation $q_i^s \to \tilde{q}_i^s$, $p_i^s \to \tilde{p}_i^s$. In Eq. (11), $\mu_i^s$ is the generalized mass of the $i$-th "soft"
normal vibration mode, being related to $g_{ij}$ in Eq. (5). $\chi$, $\varphi$ and $\psi$ in Eqs. (12)-(13) denote
the "soft" twisting degrees of freedom, which have been transformed accordingly. Note that
$\tilde{q}_i^s$ and $\tilde{p}_i^s$ are canonically conjugate coordinates. $l_\chi$, $l_\varphi$ and $l_\psi$ in Eqs. (12)-(13) are the numbers
of the $\chi$, $\varphi$ and $\psi$ degrees of freedom in the system. Note that $l_s = l_\chi + l_\varphi + l_\psi$.



Integrals in Eqs. (9)-(11) can be evaluated analytically, while for the integration over the
angles $\chi$, $\varphi$ and $\psi$ in Eqs. (12)-(13) the knowledge of the exact potential energy surface of the
polypeptide is necessary. However, the potential energy of the polypeptide corresponding to
the twisting degrees of freedom $\chi$ does not depend on the conformation of the polypeptide in
the case of neutral non-polar radicals in simple amino acids (i.e. alanine, glycine) [15]. Thus, the
twisting degrees of freedom corresponding to the variations of the angles $\chi$ have a minor influence
on the α-helix↔random coil phase transition. The potential energy of the polypeptide with
respect to these degrees of freedom is well described by the following function, as follows
from the molecular mechanics potential, Eq. (1):



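$$U(\chi_i) = k_{\chi_i}\left[1 + \cos(3\chi_i)\right], \qquad (14)$$

where $k_{\chi_i}$ is the stiffness parameter of the potential. Since $k_{\chi_i} = k_\chi$, substituting Eq. (14)
into Eq. (12) and integrating over $2\pi$ one obtains:

$$Z_4 = (2\pi)^{l_\chi}\left[\exp\left(-\frac{k_\chi}{kT}\right) I_0\left(\frac{k_\chi}{kT}\right)\right]^{l_\chi} = (2\pi)^{l_\chi} B(kT), \qquad (15)$$

where $I_0(x)$ is the modified Bessel function of the first kind, and $B(kT) = \left[\exp(-k_\chi/kT)\, I_0(k_\chi/kT)\right]^{l_\chi}$.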
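
The single-angle integral behind Eq. (15) can be verified numerically. The Python sketch below assumes the threefold potential of Eq. (14), writes $a = k_\chi/kT$, and compares a direct quadrature over one full turn with the closed form $2\pi e^{-a} I_0(a)$. The helper `bessel_i0` is a hypothetical series implementation included only to keep the example self-contained.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function I_0(x) from its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x * x / 4.0) / ((k + 1) ** 2)
    return total

def chi_integral_numeric(a, n_points=3000):
    """Integral of exp(-a*(1 + cos(3*chi))) over chi in [0, 2*pi)."""
    h = 2.0 * math.pi / n_points
    return h * sum(math.exp(-a * (1.0 + math.cos(3.0 * i * h)))
                   for i in range(n_points))

def chi_integral_closed(a):
    """Closed form 2*pi*exp(-a)*I_0(a) of the same integral."""
    return 2.0 * math.pi * math.exp(-a) * bessel_i0(a)
```

Raising the single-angle result to the power $l_\chi$ gives the product over all side-chain angles, provided they share the same stiffness $k_\chi$.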



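Substituting $Z_1$-$Z_5$ into Eq. (8) one obtains the expression for the partition function of a
polypeptide in a particular conformational state $j$:

$$Z_j = \left[\frac{V_j\, M^{3/2}\sqrt{I_1^{(j)} I_2^{(j)} I_3^{(j)}}\,\prod_{i=1}^{l_s}\sqrt{\mu_i^s}}{(2\pi)^{l_s/2 - l_\chi}\,\pi\hbar^{3N}\prod_{i=1}^{l_h}\omega_i}\right] B(kT)\,(kT)^{3N-3-\frac{l_s}{2}} \int e^{-\frac{U(\{\varphi,\psi\})}{kT}}\, d\varphi_1 \ldots d\varphi_n\, d\psi_1 \ldots d\psi_n$$
$$\phantom{Z_j} = A_j\, B(kT)\,(kT)^{3N-3-\frac{l_s}{2}} \int e^{-\frac{U(\{\varphi,\psi\})}{kT}}\, d\varphi_1 \ldots d\varphi_n\, d\psi_1 \ldots d\psi_n, \qquad (16)$$

where $A_j$ denotes the factor in the square brackets. Note that the generalized masses $\mu_i^h$ are reduced
during the integration and do not enter into the expression of the partition function.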
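
The power of $kT$ in Eq. (16) can be checked by collecting the temperature dependence of each analytically integrated factor. The bookkeeping below is a sketch that assumes Gaussian integration of every quadratic term: three translational and three rotational momenta in $Z_1$, $l_h$ momentum-coordinate pairs in $Z_2$, and $l_s$ momenta in $Z_3$.

```latex
\begin{align*}
Z_1 &\propto (kT)^{3/2}\,(kT)^{3/2} = (kT)^{3}, \qquad
Z_2 \propto (kT)^{l_h}, \qquad
Z_3 \propto (kT)^{l_s/2},\\
3 + l_h + \frac{l_s}{2} &= 3 + \left(3N - 6 - l_s\right) + \frac{l_s}{2}
  = 3N - 3 - \frac{l_s}{2}.
\end{align*}
```

The second line uses the relation $3N - 6 = l_s + l_h$ between the numbers of "soft" and "stiff" degrees of freedom.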

Since a polypeptide exists in different conformational states, one needs to sum over the
contributions of all possible conformations $Z_j$ in order to calculate the complete partition
function of the polypeptide. For an ensemble of $\mathcal{N}$ noninteracting polypeptides the partition
function reads as




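$$Z = \left(\sum_{j=1}^{\xi} Z_j\right)^{\mathcal{N}} = \left[B(kT)\,(kT)^{3N-3-\frac{l_s}{2}} \sum_{j=1}^{\xi} A_j \int\!\cdots\!\int \exp\left(-\frac{U(\{\varphi,\psi\})}{kT}\right) d\varphi_1 \ldots d\varphi_n\, d\psi_1 \ldots d\psi_n\right]^{\mathcal{N}}, \tag{17}$$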

where $Z_j$ is defined in (16) and $\xi$ is the total number of possible conformations of a
polypeptide. Equation (17) has been derived with a minimum number of assumptions about the
system. It is general; however, its use for a particular molecular system is not
straightforward. Expression (17) can be further simplified if one makes additional assumptions
about the structure of the system.

For the sake of simplicity, we write further equations for only one polypeptide instead of
$\mathcal{N}$. Generalization to the case of $\mathcal{N}$ statistically independent polypeptides can always be
done according to (17).

One can expect that the factors $A_j$ in (17) depend on the chosen conformation of the
polypeptide. However, due to the fact that the values of specific volumes, moments of inertia
and frequencies of normal vibration modes of the polypeptide in different conformations are
expected to be close 0, [l^J, the values of Aj in all these conformations can be considered as 
equal, at least in the zero order approximation. Thus Aj = A. 

The amino acids can be treated as statistically independent in any conformation of the 
polypeptide. This fact is not obvious and it was not systematically investigated so far. 
The statistical independence of small neutral non-polar amino acids (alanine, glycine, etc) 
in a polypeptide was studied in 56| with the use of time-correlation functions between 
different amino acids. In our following paper [50], we address this question for alanine
polypeptides and determine the degree to which amino acids in the polypeptide can be 
treated as statistically independent. 

With the assumptions made, the partition function of the polypeptide reduces to:

$$Z = A\, B(kT)\,(kT)^{3N-3-\frac{l_s}{2}} \sum_{j=1}^{\xi} \prod_{i=1}^{n} \int\!\!\int \exp\left(-\frac{\epsilon_i^{(j)}(\varphi_i,\psi_i)}{kT}\right) d\varphi_i\, d\psi_i, \tag{18}$$

where $\epsilon_i^{(j)}(\varphi, \psi)$ is the potential energy of the $i$-th amino acid in the polypeptide, being in one
of its $\xi$ conformations denoted with $j$. The potential energy of the amino acid is calculated
as a function of its twisting degrees of freedom $\varphi$ and $\psi$.

In equation (18) the partition function is summed over all conformations of the
polypeptide. However, in the case of the α-helix to random coil transition of the polypeptide, the
summation has to be performed only over the conformations involved in the transition.

Note that Eq. (18) is rather general and can be used for the description of the folding
process in proteins. Indeed, the partition function in Eq. (18) is determined by the potential
energy surfaces of amino acids in the native state of a protein and in the random coil
conformation. The potential energy surfaces can be calculated on the basis of ab initio DFT,
combined with molecular mechanics theories as demonstrated in 0, f| and in the following 
paper [50]. For a protein, which has 20 different amino acids, it is necessary to calculate
at least 40 different potential energy surfaces, while for the study of folding of a
polypeptide consisting of identical amino acids a single potential energy surface describes the
transition.

Further simplifications of the partition function (18) for a polypeptide consisting of
identical amino acids can be achieved if one assumes that each amino acid in the polypeptide
can occupy two states only, referred to below as the bounded and unbounded states. The amino
acid is considered to be in the bounded state when it forms one hydrogen bond with the
neighboring amino acids. In the unbounded state amino acids do not have hydrogen bonds.
When the α-helix is formed, all amino acids are in the bounded state, while in the case of
the random coil all amino acids occupy the unbounded states.

All possible conformations that the polypeptide experiences in the course of the
α-helix↔random coil phase transition can be divided into four different groups:

I. completely folded state of the polypeptide (α-helix), in which all the amino acids
occupy bounded states.

II. partially folded states of the polypeptide (phase co-existence), in which the core of $\lambda$
amino acids of the polypeptide occupy bounded states, and $n - \lambda$ boundary amino
acids are in unbounded states.

III. completely unfolded state of a polypeptide (random coil), in which all the amino acids
are in unbounded states.

IV. phase mixing, in which two or more fragments of a polypeptide are in an α-helix state,
while the amino acids between the fragments are in the random coil state.

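With the assumptions outlined above, and assuming the polypeptide to consist of $n$
identical amino acids, the partition function (18) of the system can be rewritten as follows:

$$Z = A\, B(kT)\,(kT)^{3N-3-\frac{l_s}{2}} \left[\beta Z_b^{\,n-1} Z_u + \beta \sum_{i=1}^{n-4} (i+1)\, Z_b^{\,n-i-1} Z_u^{\,i+1} + Z_u^{\,n} + \sum_{i=2}^{(n-3)/2} \beta^{i} \sum_{k=i}^{n-i-3} \frac{(k-1)!\,(n-k-3)!}{i!\,(i-1)!\,(k-i)!\,(n-k-i-3)!}\, Z_b^{\,k+3i} Z_u^{\,n-k-3i}\right]. \tag{19}$$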



Here the first and the third terms in the square brackets describe the partition function of
the polypeptide in the α-helix and in the random coil phases respectively, while the second
term in the square brackets accounts for the phase co-existence. The summation
in the second term in (19) is performed up to $n - 4$, because the shortest α-helix consists of 4
amino acids. The last term in the square brackets accounts for the polypeptide conformations
in which several helical fragments are separated by amino
acids in the random coil conformation. The first summation in this term goes over
the separated helical fragments of the polypeptide, while the second summation goes over
individual amino acids in the corresponding fragment. Polypeptide conformations with
two or more helical fragments are energetically unfavorable; this fact is discussed in our
following paper [50]. As shown there, the contribution to the partition
function represented by the fourth term in the square brackets in Eq. (19) is
small compared to the first three terms for polypeptides containing fewer than 100
amino acids. Therefore, it can be omitted in the construction of the partition function. $Z_b$
and $Z_u$ are the contributions to the partition function from a single amino acid in the
bounded and unbounded states respectively; they read as:
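
$$Z_b = \int\!\!\int \exp\left(-\frac{\epsilon^{(b)}(\varphi,\psi)}{kT}\right) d\varphi\, d\psi, \tag{20}$$

$$Z_u = \int\!\!\int \exp\left(-\frac{\epsilon^{(u)}(\varphi,\psi)}{kT}\right) d\varphi\, d\psi, \tag{21}$$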

where $\epsilon^{(b)}(\varphi,\psi)$ and $\epsilon^{(u)}(\varphi,\psi)$ are the potential energies of a single amino acid in
the bounded and in the unbounded states respectively, calculated as functions of the twisting degrees
of freedom $\varphi$ and $\psi$. $\beta$ is a factor accounting for the entropy loss of the helix initiation.
Substituting (20), (21) and (22) into equation (19) one obtains the final expression for the
partition function of a polypeptide undergoing an α-helix↔random coil phase transition. This
result can be used for the evaluation of all thermodynamical characteristics of the system.
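The single-residue contributions $Z_b$ and $Z_u$ are two-dimensional Boltzmann integrals over the twisting angles, so once a potential energy surface is tabulated or given as a function they can be evaluated by simple quadrature. The following minimal sketch uses a midpoint rule on a regular grid. The integration domain $[-\pi, \pi)$ for both angles, the function names and in particular the analytic `toy_surface` are assumptions made for illustration only; the surface is not an ab initio result and merely stands in for a calculated $\epsilon(\varphi,\psi)$.

```python
import math


def single_residue_partition(energy, kT, n=200):
    """Midpoint-rule estimate of the double integral of exp(-energy(phi, psi)/kT)
    over phi, psi in [-pi, pi). Energy and kT must be in the same units."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = -math.pi + (i + 0.5) * h
        for j in range(n):
            psi = -math.pi + (j + 0.5) * h
            total += math.exp(-energy(phi, psi) / kT)
    return total * h * h


def toy_surface(phi, psi, depth=2.0):
    """Hypothetical smooth surface with a single well of the given depth at phi = psi = 0.
    It is a placeholder for a calculated potential energy surface."""
    return -depth * 0.25 * (1.0 + math.cos(phi)) * (1.0 + math.cos(psi))


if __name__ == "__main__":
    # A flat surface gives the area of the angular domain, (2*pi)^2.
    print(single_residue_partition(lambda phi, psi: 0.0, kT=1.0, n=50))
    # The toy well increases the integral, more strongly at lower temperature.
    print(single_residue_partition(toy_surface, kT=0.6, n=100))
```

In practice the callable `energy` would interpolate the calculated surface for the bounded or the unbounded state, giving $Z_b$ or $Z_u$ at each temperature of interest.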

$\epsilon^{(b)}(\varphi,\psi)$ and $\epsilon^{(u)}(\varphi,\psi)$ determine the partition function of the polypeptide. These quantities
can be calculated on the basis of ab initio DFT, combined with molecular mechanics theories
as demonstrated in 

aa 

and in the following paper [50].



III. THERMODYNAMICAL CHARACTERISTICS OF A POLYPEPTIDE CHAIN

The first order phase transition is characterized by an abrupt change of the internal
energy of the system with respect to its temperature. In the first order phase transition the
system either absorbs or releases a fixed amount of energy, while the heat capacity as a function
of temperature has a sharp peak (see Fig. 3).

The peak in the heat capacity is characterized by the transition temperature $T_0$, the
maximal value of the heat capacity $C_0$, the temperature range of the phase transition $\Delta T$
and the specific heat $Q$, which is also referred to as the latent heat of the phase transition (see
Fig. 3).

All these quantities can be calculated if the dependence of the heat capacity on tempera- 
ture is known. The temperature dependence of the heat capacity is defined by the partition 
function as follows 



$$C(T) = kT\,\frac{\partial^2\left(T \ln Z\right)}{\partial T^2}. \tag{23}$$
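When $\ln Z(T)$ is only available numerically, the standard relation $C(T) = kT\,\partial^2(T\ln Z)/\partial T^2$ can be evaluated with a central second difference. The sketch below is one possible way to do this; the function name, the relative step size and the test case are assumptions chosen for illustration. The test case uses $\ln Z = s \ln T$, for which the relation gives $C = sk$ exactly.

```python
import math


def heat_capacity(ln_z, T, k=1.0, rel_step=1e-3):
    """Estimate C(T) = k*T * d^2(T*ln Z)/dT^2 with a central second difference.

    ln_z     : callable returning ln Z at a given temperature
    rel_step : finite-difference step as a fraction of T (an illustrative choice)
    """
    h = rel_step * T

    def f(t):
        return t * ln_z(t)

    return k * T * (f(T + h) - 2.0 * f(T) + f(T - h)) / (h * h)


if __name__ == "__main__":
    # For ln Z = 1.5*ln T the heat capacity is 1.5*k at any temperature.
    print(heat_capacity(lambda T: 1.5 * math.log(T), 300.0))
```

The step size trades truncation error against round-off, so for a sharply peaked heat capacity it should stay well below the width of the peak.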

FIG. 3: Temperature dependence of the heat capacity for a system experiencing a phase transition.

The characteristics of the phase transition are determined by the following equations:



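$$\left.\frac{dC(T)}{dT}\right|_{T=T_0} = 0, \tag{24}$$

$$C_0 = C(T_0), \tag{25}$$

$$C\left(T_0 \pm \frac{\Delta T}{2}\right) = \frac{C_0}{2}, \tag{26}$$

$$Q = \int_0^{\infty} C(T)\,dT. \tag{27}$$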

Unfortunately, it is not possible to obtain analytical expressions for $T_0$, $C_0$, $\Delta T$ and $Q$ with the
partition function defined in (19), because the integrals in (20) and (21) cannot be treated
analytically. However, the qualitative behavior of these quantities can be understood if one
assumes that all conformational states of a polypeptide in a certain phase have the same
energy. This model is usually referred to in the literature as the two-energy-level model,
and it turns out to be very useful for the qualitative analysis of the phase transitions in
polypeptide chains. If one considers the phase transition between two such phases, the
partition function can then be constructed as follows:

$$Z = Z_0\left(1 + A\,\frac{\eta_2}{\eta_1}\,e^{-\frac{\Delta E}{kT}}\right), \tag{28}$$

where $Z_0$ is the partition function of the system in the first phase, $\Delta E = E_2 - E_1$ is the
energy difference between the states of the polypeptide in two different phases, $\eta_1$ and $\eta_2$
are the numbers of isomeric states of the polypeptide in the first and in the second phases
respectively. They can also be considered as the populations of the two phases. $A = A_2/A_1$
is the coefficient depending on masses, specific volumes, normal vibration mode frequencies
and moments of inertia of the polypeptide in the two phases. Substituting equation (28)
into equation (23) one obtains the expression for the heat capacity in the framework of the
two-energy-level model:

$$C(T) = \frac{A\,\frac{\eta_2}{\eta_1}\,\Delta E^2\, e^{-\frac{\Delta E}{kT}}}{kT^2\left(1 + A\,\frac{\eta_2}{\eta_1}\,e^{-\frac{\Delta E}{kT}}\right)^2}. \tag{29}$$

Substituting equation (29) into equations (24)-(27) and solving them one obtains the
expressions for $T_0$, $C_0$, $\Delta T$ and $Q$, which read as:

$$T_0 = \frac{\Delta E}{k \ln\left(A\,\frac{\eta_2}{\eta_1}\right)} = \frac{\Delta E}{\Delta S}, \tag{30}$$

$$C_0 = \frac{\Delta S^2}{4k}, \tag{31}$$

$$\Delta T = \sqrt{\frac{64 \ln 2}{\pi}}\;\frac{k\,\Delta E}{\Delta S^2}, \tag{32}$$

$$Q = \int_0^{\infty} C(T)\,dT = \Delta E. \tag{33}$$
Here $\Delta S = k \ln(A\eta_2) - k \ln \eta_1$ is the entropy change in the system and $M$ is the mass of a single
polypeptide. $\Delta S$ and $\Delta E$ are the major thermodynamical parameters in the considered
problem, since they determine the behavior of the phase transition characteristics. From
equations (30)-(33) it follows that $T_0 \sim \Delta E/\Delta S$, $C_0 \sim \Delta S^2$, $Q \sim \Delta E$ and $\Delta T \sim \Delta E/\Delta S^2$.
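The scaling relations of the two-energy-level model can be checked numerically. The sketch below evaluates the heat capacity of Eq. (29), written with $A\eta_2/\eta_1 = \exp(\Delta S/k)$, on a temperature grid and extracts the peak position, peak height, full width at half maximum and the integral of $C(T)$. Several things here are assumptions for illustration: the function names, the units with $k = 1$, the sample values $\Delta E = 100$ and $\Delta S = 50$, the reading of $\Delta T$ as a full width at half maximum, and the closed forms $T_0 = \Delta E/\Delta S$, $C_0 = \Delta S^2/4k$, $\Delta T = \sqrt{64\ln 2/\pi}\,k\Delta E/\Delta S^2$ and $Q = \Delta E$ used for comparison.

```python
import math


def two_level_heat_capacity(T, dE, dS, k=1.0):
    """Heat capacity of the two-energy-level model with A*eta2/eta1 = exp(dS/k).

    Uses x/(1+x)^2 = 1/(4*cosh(y/2)^2) with x = exp(y), y = dS/k - dE/(k*T).
    """
    y = dS / k - dE / (k * T)
    return dE * dE / (k * T * T) / (4.0 * math.cosh(0.5 * y) ** 2)


def peak_characteristics(temps, caps):
    """Return (T0, C0, full width at half maximum, integral of C dT) from sampled C(T).

    Assumes the grid contains a single peak and both half-maximum crossings.
    """
    i0 = max(range(len(caps)), key=caps.__getitem__)
    T0, C0 = temps[i0], caps[i0]
    half = 0.5 * C0

    def crossing(i, step):
        # Walk away from the peak until C drops to half maximum, then interpolate linearly.
        while caps[i + step] > half:
            i += step
        t_a, t_b = temps[i], temps[i + step]
        c_a, c_b = caps[i], caps[i + step]
        return t_a + (half - c_a) * (t_b - t_a) / (c_b - c_a)

    width = crossing(i0, +1) - crossing(i0, -1)
    # Trapezoidal rule for the latent heat Q.
    Q = sum(0.5 * (caps[i] + caps[i + 1]) * (temps[i + 1] - temps[i])
            for i in range(len(temps) - 1))
    return T0, C0, width, Q


if __name__ == "__main__":
    dE, dS = 100.0, 50.0  # hypothetical values in units with k = 1
    temps = [1.0 + 2.0 * i / 20000 for i in range(20001)]
    caps = [two_level_heat_capacity(T, dE, dS) for T in temps]
    T0, C0, width, Q = peak_characteristics(temps, caps)
    print("T0", T0, "vs", dE / dS)
    print("C0", C0, "vs", dS ** 2 / 4.0)
    print("width", width, "vs", math.sqrt(64.0 * math.log(2.0) / math.pi) * dE / dS ** 2)
    print("Q", Q, "vs", dE)
```

For a sharp transition the numerical peak position, height and area should follow the closed forms closely, while the measured width is only expected to agree with the closed-form estimate approximately.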



The numerical calculation and analysis of various thermodynamical characteristics such
as the latent heat or the heat capacity is done in the following paper [50].

IV. CONCLUSION



In the present paper a novel ab initio theoretical method for treating the α-helix↔random
coil phase transition in polypeptide chains is introduced. The suggested method is based
on the construction of a parameter-free partition function for a system undergoing a first
order phase transition. All the necessary information for the construction of such a partition
function can be calculated on the basis of ab initio DFT, combined with molecular mechanics
theories (see the results in the following paper [50]).

The suggested method can be considered an efficient alternative to the existing theoretical
approaches for the study of the helix-coil transition in polypeptides, since it does not contain any
model parameters. It gives a universal recipe for the statistical mechanics description of complex
molecular systems. The partition function of a polypeptide is written with a minimum number
of assumptions about the system, which makes our method more general
than other theoretical approaches.

In the present paper we introduced a novel theoretical method for the study of the
α-helix↔random coil phase transition in polypeptides. In the following paper [50] we report
the results of numerical simulations of this process obtained within the framework of the
suggested model.